Language Resource Development at DLSU-NLP Lab

نویسنده

  • Shirley B. Chu
چکیده

In 2003, the Department of Science and Technology awarded a 5million-peso grant to De La Salle University for the development of an EnglishFilipino Machine Translation System. Faced with limited resources for the Filipino language, the team has to build language resources and develop language tools in order to complete the system. This paper presents the different resources and tools created and built by the team and the successive improvements on these projects after 2006.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Survey of NLP Methods and Resources for Analyzing the Collaborative Writing Process in Wikipedia

With the rise of the Web 2.0, participatory and collaborative content production have largely replaced the traditional ways of information sharing and have created the novel genre of collaboratively constructed language resources. A vast untapped potential lies in the dynamic aspects of these resources, which cannot be unleashed with traditional methods designed for static corpora. In this chap...

متن کامل

UIMA-Based JCoRe 2.0 Goes GitHub and Maven Central ― State-of-the-Art Software Resource Engineering and Distribution of NLP Pipelines

We introduce JCORE 2.0, the relaunch of a UIMA-based open software repository for full-scale natural language processing originating from the Jena University Language & Information Engineering (JULIE) Lab. In an attempt to put the new release of JCORE on firm software engineering ground, we uploaded it to GITHUB, a social coding platform, with an underlying source code versioning system and var...

متن کامل

Participation in Language Resource Development and Sharing

Language resources are really much required for understanding and modeling the language in the present approaches. The language that has a rich language resource gains a big benefit in making a big advance in language processing. On the other hand, the less resource language is struggling with preparing a large enough language resource such as raw text or annotated corpora. It is a labor intens...

متن کامل

A Semi-Automatic, Iterative Method for Creating a Domain-Specific Treebank

In this paper we present the development process of NLP-QT, a question treebank that will be used for data-driven parsing in the context of a domain-specific QA system for querying NLP resource metadata. We motivate the need to build NLP-QT as a resource in its own right, by comparing the Penn Treebank-style annotation scheme used for QuestionBank (Judge et al., 2006) with the modified NP annot...

متن کامل

Towards Enhanced Interoperability for Large HLT Systems: UIMA for NLP

We introduce JCORE, a full-fledged UIMA-compliant component repository for complex text analytics developed at the Jena University Language & Information Engineering (JULIE) Lab. JCORE is based on a comprehensive type system and a variety of document readers, analysis engines, and CAS consumers. We survey these components and then turn to a discussion of lessons we learnt, with particular empha...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009